Search | Global Index Medicus

FusionScan: accurate prediction of fusion genes from RNA-Seq data

Pora KIM; Ye-Eun JANG; Sanghyuk LEE.

Genomics & Informatics ; : e26, 2019.

Article in English | WPRIM | ID: wpr-763821

ABSTRACT

Identification of fusion genes is of prime importance in cancer research because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identifying fusion transcripts. Although a number of algorithms have been developed thus far, most programs produce too many false positives, making experimental confirmation almost impossible. We still lack a reliable program that achieves high precision with a reasonable recall rate. Here, we present FusionScan, a highly optimized tool for predicting fusion transcripts from RNA-Seq data. We specifically search for split reads composed of intact exons at the fusion boundaries. Using 269 known fusion cases as the reference, we have implemented various mapping and filtering strategies to remove false positives without discarding genuine fusions. In the performance test using three cell line datasets with validated fusion cases (NCI-H660, K562, and MCF-7), FusionScan outperformed other existing programs by a considerable margin, achieving precision and recall rates of 60% and 79%, respectively. A simulation test also demonstrated that FusionScan recovered most of the true positives without producing an overwhelming number of false positives, regardless of sequencing depth and read length. The computation time was comparable to that of other leading tools. We also provide several curative means to help users investigate the details of fusion candidates easily. We believe that FusionScan will be a reliable, efficient, and convenient program for detecting fusion transcripts that meets the requirements of the clinical and experimental community. FusionScan is freely available at http://fusionscan.ewha.ac.kr/.

Subject(s)

Cell Line, Dataset, Exons, Gene Fusion, Sequence Analysis, RNA, Translocation, Genetic
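
The precision and recall figures quoted in the abstract above can be illustrated with a minimal Python sketch. It assumes predictions and validated fusions are represented as sets of gene-pair labels; the function name `precision_recall` and the placeholder labels are hypothetical and not part of FusionScan, and the abstract does not state how results were pooled across the three cell lines.

```python
def precision_recall(predicted, validated):
    """Return (precision, recall) for predicted vs. validated fusion calls.

    Both arguments are iterables of hashable fusion labels,
    e.g. strings of the form "GENE_A--GENE_B" (placeholder format).
    """
    predicted, validated = set(predicted), set(validated)
    true_positives = predicted & validated
    # Precision: fraction of predictions that are validated fusions.
    precision = len(true_positives) / len(predicted) if predicted else 0.0
    # Recall: fraction of validated fusions that were predicted.
    recall = len(true_positives) / len(validated) if validated else 0.0
    return precision, recall


# Placeholder labels only; these are not results from the paper.
predicted_calls = {"G1--G2", "G3--G4", "G5--G6", "G7--G8", "G9--G10"}
validated_calls = {"G1--G2", "G3--G4", "G5--G6", "G11--G12"}

p, r = precision_recall(predicted_calls, validated_calls)
print(f"precision={p:.2f} recall={r:.2f}")
```

A tool that reports many candidates lowers precision even if recall stays high, which is the trade-off the abstract describes.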

Prediction of Mammalian MicroRNA Targets: Comparative Genomics Approach with Longer 3' UTR Databases

Seungyoon NAM; Young-Kook KIM; Pora KIM; V-Narry KIM; Seokmin SHIN; Sanghyuk LEE.

Genomics & Informatics ; : 53-62, 2005.

Article in English | WPRIM | ID: wpr-57214

ABSTRACT

MicroRNAs play an important role in regulating gene expression, but identifying their targets is difficult because of their short length and imperfect complementarity. Burge and coworkers developed a program called TargetScan that allowed imperfect complementarity and established a procedure favoring targets with multiple binding sites conserved in multiple organisms. We improved their algorithm in two major aspects: (i) using a well-defined UTR (untranslated region) database, and (ii) examining the extent of conservation inside the 3' UTR specifically. The average UTR length in our database, based on the ECgene annotation, is more than twice that in Ensembl. TargetScan was then used to identify putative binding sites. The extent of conservation varies significantly inside the 3' UTR. We used the "tight" tracks in the UCSC genome browser to select the binding sites conserved in multiple species. By combining the longer 3' UTR data, TargetScan, and tightly conserved blocks of genomic DNA, we identified 107 putative target genes with multiple binding sites conserved in multiple species, of which 85 are novel.

Subject(s)

3' Untranslated Regions, Binding Sites, DNA, Gene Expression, Genome, Genomics, Methods, MicroRNAs
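
The idea of favoring targets with multiple binding sites present in multiple species, as described in the abstract above, can be sketched in Python. This is a deliberately simplified illustration, not TargetScan or the authors' pipeline: it assumes exact matches to the complement of miRNA nucleotides 2-8 (the seed), whereas the abstract notes that imperfect complementarity is allowed, and it counts sites per species rather than checking aligned conserved blocks. All function names and the toy UTR sequences are hypothetical.

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Return the mRNA motif complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # 0-based slice covering positions 2..8
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def find_sites(utr, motif):
    """Return 0-based start positions of all motif occurrences in utr."""
    positions = []
    start = utr.find(motif)
    while start != -1:
        positions.append(start)
        start = utr.find(motif, start + 1)  # allow overlapping hits
    return positions


def has_multiple_sites_in_all_species(utrs_by_species, mirna, min_sites=2):
    """True if every species' 3' UTR holds at least min_sites seed matches."""
    motif = seed_site(mirna)
    if not utrs_by_species:
        return False
    return all(
        # Accept DNA or RNA input by normalizing to upper-case RNA.
        len(find_sites(utr.upper().replace("T", "U"), motif)) >= min_sites
        for utr in utrs_by_species.values()
    )


# Toy sequences for illustration only; not real 3' UTRs.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utrs = {
    "species_1": "AAACUACCUCGGGCUACCUCAA",
    "species_2": "CTACCTCTTTTCTACCTC",
}
print(seed_site(mirna))
print(has_multiple_sites_in_all_species(toy_utrs, mirna))
```

Longer 3' UTR annotations give a scan like this more sequence to search, which is consistent with the abstract's motivation for using a longer UTR database.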
